A Study of Automatic Speech Recognition in Noisy Classroom Environments for Automated Dialog Analysis

نویسندگان

  • Nathaniel Blanchard
  • Michael Brady
  • Andrew Olney
  • Marci Glaus
  • Xiaoyi Sun
  • Martin Nystrand
  • Borhan Samei
  • Sean Kelly
  • Sidney K. D'Mello
چکیده

The development of large-scale automatic classroom dialog analysis systems requires accurate speech-to-text translation. A variety of automatic speech recognition (ASR) engines were evaluated for this purpose. Recordings of teachers in noisy classrooms were used for testing. In comparing ASR results, Google Speech and Bing Speech were more accurate with word accuracy scores of 0.56 for Google and 0.52 for Bing compared to 0.41 for AT&T Watson, 0.08 for Microsoft, 0.14 for Sphinx with the HUB4 model, and 0.00 for Sphinx with the WSJ model. Further analysis revealed both Google and Bing engines were largely unaffected by speakers, speech class sessions, and speech characteristics. Bing results were validated across speakers in a laboratory study, and a method of improving Bing results is presented. Results provide a useful understanding of the capabilities of contemporary ASR engines in noisy classroom environments. Results also highlight a list of issues to be aware of when selecting an ASR engine for difficult speech recognition tasks.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

متن کامل

Improving the performance of MFCC for Persian robust speech recognition

The Mel Frequency cepstral coefficients are the most widely used feature in speech recognition but they are very sensitive to noise. In this paper to achieve a satisfactorily performance in Automatic Speech Recognition (ASR) applications we introduce a noise robust new set of MFCC vector estimated through following steps. First, spectral mean normalization is a pre-processing which applies to t...

متن کامل

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Abstract   Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

متن کامل

Learning Robust Dialog Policies in Noisy Environments

Modern virtual personal assistants provide a convenient interface for completing daily tasks via voice commands. An important consideration for these assistants is the ability to recover from automatic speech recognition (ASR) and natural language understanding (NLU) errors. In this paper, we focus on learning robust dialog policies to recover from these errors. To this end, we develop a user s...

متن کامل

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have been shown their performance in speech recognition systems for extracting features, and also acoustic modeling. In addition, CNNs have been used for robust speech recognition and competitive results have been reported. Convolutive Bottleneck Network (CBN) is a kind of CNNs which has a bottleneck layer among its fully connected layers. The bottleneck fea...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015